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ABSTRACT 

At  present ,  software  for  formula  translation  is  used  routinely  in  numerical 
computation.  On  the  other  hand,  although  software  for  differentiation  of 
formulas  is  easy  to  construct  and  widely  available,  many  numerical  analysts 
seem  to  be  unaware  of  its  existence  and  potential  for  numerical  computation. 

A  simple  procedure  for  formula  translation  and  differentiation  will  be  described, 
and  some  significant  applications  will  be  indicated.  In  ordinary  computation, 
these  include  solution  of  ordinary  and  partial  differential  equations  by  series 
methods  (Taylor  and  Lie  series,  for  example),  solution  of  nonlinear  systems  of 
equations  by  Newton's  method  and  its  variants,  and  nonlinear  optimization 
(constrained  and  unconstrained).  Together  with  interval  analysis,  differentia¬ 
tion  can  be  used  to  determine  properties  of  functions  and  thus  automate  the 
application  of  certain  theorems,  such  as  ones  for  existence  of  fixed  points 
of  n-d imensional  operators  or  solutions  of  nonlinear  systems  of  equations. 
Another  large  field  of  application  of  differentiation  is  to  automatic  error 
analysis,  either  using  the  graph  structure  of  the  computation,  or  interval 
analysis.  An  example  is  given  of  a  program  for  numerical  integration  in  which 
automatic  differentiation  and  interval  analysis  are  combined  to  provide  a 
priori  and  a  posteriori  error  bounds  for  the  results. 

/V 

AMS  (MOS)  Subject  Classifications:  65D30,  65G05,  6SG10,  65H10,  65K10,  6SL05, 

6SV05 ,  68-04,  68C05,  68C20 

Key  Words:  Automatic  differentiation,  Taylor  coefficient  generation.  Error 

estimation.  Differential  equations.  Nonlinear  systems.  Optimization, 
Numerical  integration.  Interval  analysis 

Work  Unit  Numbers  7  (Numerical  Analysis) 

8  (Computer  Science) 


Research  sponsored  by  the  United  States  Army  under  Contract  No.  DAAG29-75-C-0024 . 


SIGNIFICANCE  AND  EXPLANATION 

Money  an  be  saved  by  turning  routine  clerical  tasks  over  to  electronic 
computers.  Such  work  includes  many  of  the  details  of  computer  programming 
itself,  and  a  number  of  high  level  languages  have  been  introduced  to  assist  with 
the  writing  of  software.  One  feature  of  these  languages  is  formula  transla¬ 
tion,  which  prepares  subroutines  to  evaluate  functions  written  in  a  notation 
similar  to  that  used  in  ordinary  mathematics.  Many  scientific  and  engineering 
alculations  require  derivatives  of  functions  or  coefficients  of  jxiwer  series 
expansions  of  functions.  The  formulation  of  subroutines  to  evaluate  derivatives 
of  functions  or  Taylor  coefficients  can  be  performed  by  a  computer  in  much  the 
same-  manner  as  formula  translation.  Software  for  these  purposes  has  been 
developed  at  the  Mathematics  Research  Center  over  the  past  IS  years,  and  has 
made  the  computer  more  useful. 

This  report  describes  techniques  of  automatic  differentiation  by  list 
processing,  and  the  generation  of  Taylor  coefficients  by  the  use  of  recurrence 
relations.  A  number  of  possible  applications  are  mentioned,  including  the  solu¬ 
tion  of  differential  equations,  nonlinear  systems  of  equations,  constrained  and 
unconstrained  optimization  problems,  and  various  methods  for  error  estimation. 
Some  of  these  have  been  implemented  by  actual  software,  and  an  example  is 
given  of  automatic  error  estimation  for  a  numerical  integration.  In  this  latter 
eximple,  differentiation  is  used  in  combination  with  interval  arithmetic.  The 
u  •  *  of  derivatives  increases  the  applicability  of  interval  arithmetic  consider¬ 
ably,  as  explained  in  this  report.  However,  unlike  interval  arithmetic,  software 
for  differentiation  can  be  written  entirely  in  a  high  level  language  such  as 
FORTRAN,  and  thus  is  easily  portable  between  installations.  As  indicated  in  this 
report,  the  use  of  this  software  where  appropriate  can  result  in  savings  of  time 
and  effort,  as  well  as  a  more  effective  approach  to  certain  problems. 

The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive 
uranary  lies  with  MRC ,  and  not  with  the  author  of  this  report. 


APPLICATIONS  OF  SOFTWARE  FOR  AUTOMATIC  D I  FF  ERENT I  AT  ION 


IN  NUMERICAL  COMPUTATION 
L.  B.  Rail 


1.  Introduction .  At  the  foundations  of  numerical  computation  lie  the  means  by 
which  calculations  are  actually  done.  Today,  this  means  not  or.ly  the  computing 
machines  (hardware),  but  also  the  programs  (software)  available  to  the  user. 
This  paper  describes  software  for  differentiation  of  functions  and  some  of 

its  applications.  Differentiation  is  somewhat  similar  to  washing  dishes, 
a  dull,  uninteresting  task  best  left  to  a  machine.  Two  methods  are  given  for 
automatic  differentiation,  and  their  use  in  various  programs  for  applications 
is  described.  As  the  discussion  is  limited  to  actual  experience  at  the 
Mathematics  Research  Center  over  the  past  15  years,  no  mention  is  made  of  much 
fine  software  which  has  doubtless  been  developed  elsewhere. 

At  the  end  of  the  paper,  some  conclusions  concerning  the  past  usefulness 
of  automatic  differentiation  will  be  given,  together  with  some  predictions  of 
possible  future  trends. 

2.  Differentiation  by  list  processing.  An  approach  to  formula  translation  and 
differentiation  which  is  conceptually  simple  is  provided  by  a  list  processing 
technique.  Rather  than  use  a  general  purpose  list  processing  language,  how¬ 
ever,  it  was  considered  more  appropriate  to  develop  software  for  the  specific 
purpose  of  evaluating  and  differentiating  formulas.  This  technique,  described 
in  the  book  by  Rail  (35,  pp.  155-160],  forms  the  basis  of  the  program  written 
in  1065  by  Reiter  ( 40 J  and  later  extended  by  Gray  and  Reiter  (15)  and 

Wertz  (45,46],  the  latter  assisted  by  T.  Ladner. 

The  principles  are  best  illustrated  by  an  example  (35,  pp.  155-156). 
Suppose  that  one  wishes  to  evaluate  the  function 

(2.1)  f(x,y)  •  (xy  ♦  sin  x  ♦  4)  (3y2  ♦  6) 
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and  some  of  its  derivatives.  The  first  step  is  to  decode  the  formula 


(2.2)  F  *  (X.Y  ♦  SIN (X)  +  4) • (3*Y**2  +  6) 

corresponding  to  (2.1)  into  a  sequence  of  arithmetic  operations  and  calls  to 
subroutines.  This  gives  the  code  list 


(2.3) 


T1  «  X*Y 
T2  -  SIN(X) 
T3  «  T1  +  T2 
T4  »  T3  +  4 
T5  »  Y«*2 
T6  •  3*TS 
T7  •  T6  ♦  6 
F  -  T4«T7  . 


Note  that  the  code  list  is  itself  a  sequence  of  statements  in  the  same  language 
in  which  (2.2)  is  written,  and  hence  can  be  translated  by  the  same  compiler 
into  machine  language.  Now,  to  differentiate  (2.3),  the  program  refers  to  a 
dictionary  containing  the  derivatives  of  the  arithmetic  operations  and  sub¬ 
routines  called.  For  example,  the  entry  corresponding  to 


(2.4)  U2  -  SIN(Ul) 
would  be 

(2.5)  VI  -  COS(Ul) 

DU 2  »  VI *DU1 


and  so  forth.  Applying  this  procedure  to  (2.3),  one  obtains  the  code  list  for 
the  differential  of  F, 


Accession  For 

gkui~ 

K>C  TAB 

Unannounced 

Jur;tii  icatlon 


In  order  to  obtain  the  code  list  for  3f/3x.  for  example,  one  sets  DX  -  1 , 
DY  0  in  (2.6)  and  obtains,  after  elimination  of  trivialities  (multiplica¬ 
tions  by  1  or  0,  exponentiation  to  the  first  power,  addition  of  0) ,  the  list 


DXT1  -  Y 


DXT2  -  COS(X) 


DXT4  -  DXT1  ♦  DXT2 


DXF  -  DXT4  »T7 


In  this  case,  the  list  for  the  derivative  DXF  is  simpler  than  the  list  for  F 
due  to  the  polynomial  terms  in  f(x,y).  For  nonpolynomial  functions,  one  would 
expect  differentiation  to  yield  more  complex  lists;  however,  polynomials 
are  not  at  all  uncommon  in  the  modelling  and  analysis  of  nonlinear  problems. 

The  most  important  feature  of  differentiation  by  list  processing  is  that 
the  output  (2.7)  is  a  list  of  exactly  the  same  form  as  the  input  list  (2.3), 
and  -an  itself  be  differentiated  with  respect  to  any  variable  entering  into 
it,  simply  by  invokinq  the  same  processor.  Also,  if  one  needs  only  some  high 
d<  i lvative  of  a  function,  the  lists  for  the  intermediate  derivatives  can  be 
discarded  after  they  are  processed,  or  at  least  it  is  not  necessary  to 

lie  them  into  machine  lanquaqc.  This  can  result  in  a  considerable  saving 
of  computer  time  and  storage  locations. 


It  is  instructive  to  view  the  procedure  described  above  in  terms  of  the 
Kantorovich  graph  117)  which  is  equivalent  to  the  code  list.  Figure  2.1 
shows  the  graph  corresponding  to  the  list  (2.3),  Figure  2.2  the  graph  for 
computing  both  F  and  its  partial  derivative  DXF  with  respect  to  X. 

To  obtain  a  graph  for  the  computation  of  DXF  only,  the  nodes  indicated  by 
squares  in  Figure  2.2  are  unnecessary,  and  can  be  eliminated. 

3.  Automatic  generation  of  Taylor  coefficients.  This  is  the  approach  adopted 
by  Moore  et  al.  in  1964  (30]  in  connection  with  the  solution  of  ordinary 
differential  equations  by  series,  and  was  implemented  by  Reiter  in  1965  (38] 
and  1967  (41)  as  an  independent  program.  Recursive  generation  of  Taylor 
coefficients  is  also  incorporated  in  the  more  general  program  differentiation 
software  developed  by  Kedem  [18],  a  system  which  is  also  capable  of  supervising 
the  list  processing  approach.  To  illustrate  this  method,  the  notation 


(3.1) 


1.2, 


-j 


will  be  used  for  the  first  j  +  1  Taylor  coefficients  of  the  function  £ (x) 
at  x  ■  xQ.  (In  case  f  depends  on  several  variables,  the  derivatives  in 
(3.1)  are  understood  to  be  partial  with  respect  to  the  variable  of  interest.) 
The  recursion  relations  for  generating  the  successive  Taylor  coefficients 
for  the  arithmetic  operations  and  various  functions  are  well  known  (27, 
pp.  107-118;  19].  For  example. 


(3.2) 


<fg) 


i  ftVt' 


j  *  0,1,2 . 


T-0 


These  relationships  can  be  used  to  process  the  code  list  for  the  evaluation 
of  a  function  in  the  same  way  as  the  formulas  for  differentials  were  used  m 
the  previous  section. 

To  interpret  this  method  in  terms  of  the  Kantorovich  graph,  suppose  that, 
instead  of  simply  values  of  terms,  a  vector  consisting  of  the  first  j  *  1 
Taylor  coefficients  is  received  by  each  node  along  the  edges  coming  into  »t 
from  above.  This  vector  is  then  processed  at  the  node  by  invoking  the 
corresponding  recurrence  relations,  and  the  resulting  vector  is  then 
transmitted  to  the  lower  nodes  connected  to  it.  Using  the  graph  in  Figure  2  i, 
for  example,  one  obtains  the  Taylor  coefficients  (3.1)  by  setting 


The  related  technique  of  the  Lie-series  method  for  the  numerical  solu¬ 
tion  of  differential  equations  also  makes  use  of  the  automatic  qeneration 
of  Taylor  coefficients,  as  described  by  Knapp  and  Wanner  [19J.  The  original 
program  120],  written  in  1968,  required  the  user  to  write  the  sequence  of 
subroutine  calls  corresponding  to  a  code  list  of  the  form  (2.3)  for  the 
integrand.  In  1969,  automatic  generation  of  the  code  list  was  added  by 
T.  Szymanski ,  under  the  direction  of  Julia  Gray,  to  obtain  an  improved  program 
(21).  Once  again,  the  result  is  apparently  competitive  with  Runge-Kutta 
methods  of  the  same  order  (19,37]. 

b.  Solution  of  nonlinear  systems  of  equations.  A  popular  and  effective 
method  for  the  solution  of  systems  of  n  nonlinear  equations  in  n  unknowns, 

(4.1)  f.(x  ,x  ,...,x  )  -  0,  i  -  1,2, ... ,n  , 

l  1  2  n 


is  the  Newton-Kantorovich  algorithm  116,35].  However,  to  implement  this 

method  in  such  a  way  as  to  obtain  quadratic  convergence,  one  must  calculate 
2 

the  n*  partial  derivatives 


(4.2) 


i,j  -  1,2, 


to  obtain  the  n  *  n  Jacobian  matrix 


,n  , 


(4.3) 


J  -  f*(x)  -  (f*  )  . 


Tins  is  done  by  automatic  differentiation  in  the  1967  program  of  Gray  and 
Rail  (10]  and  the  1972  version  prepared  by  Kuba  and  Rail  (231  and  described 
in  [11,35]. 

Higher  order  methods,  such  as  Chebyshev's  method  and  the  inversion  of 
power  series  would  require  computation  of  the  Hessian  operator 


(4.4) 


f"(x)  - 


32f  i 
VxjTxk 


and  perhaps  higher  derivatives  (35).  The  automatic  formation  of  (4.4)  is 
implemented  in  the  programs  (10,23)  in  connection  with  verification  of 
existence  of  solutions  of  (4.1)  and  error  estimation. 

Another  approach  to  the  solution  of  the  system  (4.1)  using  derivative* 
is  a  continuation  or  homotopy  method  based  on  the  conversion  of  (4.1)  into  a 


-8- 


system  of  differential  equations  for  a  curve  connecting  an  initial  approxima- 
tion  xo  for  which  f (x  ^)  *  y  to  a  solution  x  .  Such  a  system  is,  using 

vector  notation 


(4.5) 


-y  . 


with  the  initial  conditions  x(0)  *  xQ.  Integrating  (4.5)  by  some  curve¬ 
following  technique  ( 24 )  from  t  -  0  to  t  -  1  will  give  an  approximation 
to  x ( 1 )  «  x  .  The  system  (4.5)  can  be  constructed  using  software  for 
di f  f erentiation. 

c.  Optimization  of  nonlinear  functionals.  A  problem  arising  frequently 
in  the  applications  of  mathematics  is  to  optimize  (maximize  or  minimize) 
the  value  of  a  nonlinear  functional 


(4.6) 


g  -  g (x, ,x  . . . , ,x  ) 
1  i  n 


With  sufficient  smoothness,  this  can  be  converted  into  the  problem  of  solving 
the  system  (4.1)  by  taking  the  gradient 


(4.7) 


Vg 


2a.  ia_ 

3X1  '  3*2 


ia_ 

3x 

n 


and  setting  Vg  ■  0, 


which  is  (4.1)  with  f  .  (x,  ,x_ , . . .  ,x  )  ■  3g/3x. , 

112  n  l 


1,2, 


,n.  If  one  wants  to  solve  this  system  by  a  method  of  Newton  type, 
th*-n  the  Hessian  matrix  O^g/Sx^Sx.)  is  required,  which  is  simply  the 
Jacobian  (4.3).  In  problems  of  this  type,  which  are  called  unconstrained 
optimization  problems,  the  formation  of  the  gradient  and  the  Hessian  can  be 
automated  by  the  use  of  software  for  differentiation. 

Another  type  of  optimization  problem  requires  the  satisfaction  of  condi¬ 
tions 


(4.8) 


hT  (X1 ,X2' 


'V 


0, 


1,2,  ...,k 


which  are  called  constraints .  Given  enough  smoothness,  the  constrained 
optimization  problem  can  be  reduced  to  the  solution  of  a  system  of  equations 
of  the  form  (4.1)  by  the  introduction  of  new  unknowns  X  .  Aj  • . . . # # 
called  Lagrange  multipliers.  One  obtains  the  nonlinear  equations 


(4.9) 


ia_  +  I 

3xi  til 


3h 

‘xs: 


0, 


1.2, 
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the  sanu-  fun.  tio  in 
malignant.  In  order 
software  or,  perhaps 
rather  than  ordinary 


lei  to  pick  one  in  which  roundoff  error  is  the  least 
to  automate  this  process,  one  can  use  the  existing 
better,  write  a  list  processor  which  produces  relative 
differentials.  For  example,  for 


(5.6) 


U3  -  U1 »U2  , 


the  dictionary  entry  for  the  relative  differential  would  be 


(5.7) 


RHJU3  -  RHOU1  ♦  KHOU2  , 


and  so  on . 

Another  direct  application  of  automatic  differentiation  to  roundoff 
ei rot  analysis  would  be  to  automate  the  linearization  technique  due  to 

Stun® el  |44). 


•  -  -'put  at  i-n.i  1  verification  of  existence  of  solutions  of  systems  of  equa- 
..using  intiTVjj  analysis  and  automatic  differentiation.  Interval 
analyst  .127,281  provides  a  method  for  obtaining  bounds  for  the  ranges 
of  ompu'Able  functions,  including  perturbations  due  to  roundoff  error  in  the 
a  tual  omputations. .  Software  for  computing  with  interval  arithmetic  was 
developed  side-by-side  with  differentiation  software  by  Reiter  [39,42], 
starting  in  1965.  A  modern  j^ackage  has  been  prepared  by  Yohe  [47).  Unlike 
the  programs  for  differentiation,  which  can  be  written  in  a  high  level 
aaguage,  interval  software  was  originally  highly  machine  dependent,  including 
li.pu*  and  output  [2? .  Using  advances  in  computer  science,  Crary  and  Ladner 
[9),  ?rary  (6,7,8),  and  Yohe  (48,49)  have  been  able  to  reduce  this  dependence 
to  a  minimum,  at  the  cost  of  speed. 

For  the  present  purposes,  it  is  adequate  to  know  that  one  can  extend 
nu»e: ical  computations  from  ordinary  numbers  x  to  closed  intervals 
X  ■  |a,b)  (wnich  include  numbers  if  a  *  b) .  Execution  in  interval 
arithmetic  of  a  program  for  the  calculation  of  a  function  f (x)  will  result 
in  an  interval  extension  F(X)  of  f(x)  such  that 

(6.1)  f  (x)  r  F (X)  ,  x  t  X  . 

Thus,  for  X  [ a , b ] J  »  max(|a|,  bj),  one  will  have 

(6.2)  |f  (x)  |  <  |F(X)  |  ,  x  t  X  . 

These  results  extend  to  finite  dimensional  spaces,  using  the  maximum  absolute 


11 


component  norm  for  vectors,  and  the  corresponding  norm  for  matrices.  In 
r”,  the  closed  ball  with  center  y  and  radius  p  can  be  represented  as  an 
interval , 

(6.3)  U(y,P)  ■  lx  :  ||x  -  y  || ^  <  p}  ■  ly  -  pe,  y  *  pel  . 
where  e  -  (1,1,. ..,1). 

An  immediate  application  of  these  techniques  is  to  automate  some 
theorems  from  classical  and  interval  analysis  on  the  existence  of  solutions 
of  systems  of  equations  (4.1)  and  to  obtain  error  bounds. 

a.  The  contraction  mapping  theorem.  Here,  one  transforms  the  system 
(4.1),  written  in  vector  form  as  f(x)  -  0,  as  a  fixed  point  problem 

(6.4)  x  ■  P (x) 

for  some  operator  P,  for  example,  to  use  an  iteration  method 

“•5’  Vi  -  "•<>•>•1 . 

starting  from  some  initial  approximation  x  ■  y.  If  P  is  Lipschitz 
continuous  in  some  sufficiently  large  ball  U(y,p),  that  is, 

(6.6)  P  (x)  -  p  (z)  ||  <_  a  ||  x  —  m  |t  *  x,z  t  U(y,p)  , 

then  it  follows  from  the  theorem  of  Banach  (35,  pp.  64-74)  that  the  sequence 

•  - 

generated  by  (6.5)  converges  to  a  solution  x  *  U(y,p),  provided 

(6.7)  p  >  Uxj-  xo||/(l  -  a)  . 

In  order  to  obtain  the  Lipschitz  constant  a,  automatic  differentiation 
may  be  used  to  compile  a  program  to  evaluate  the  n  *  n  matrix 
P'(x)  ■  (SP^/Sx^).  Using  interval  arithmetic,  the  same  program  will  compute 
the  extension  ♦ * (X)  on  the  interval  X  •  U(y,o)  defined  by  (6.3).  If  then 

(6.8)  a  -  | ♦ * (X)  |  <  1 

and  (6.7)  holds,  then  the  hypotheses  of  the  contraction  mapping  theorem  are 
satisfied.  This  computation ,  if  successful,  also  yields  the  error  bound 
II y  -  x  ||  for  y  as  an  approximate  solution. 

b.  The  Kantorovich  theorem  on  the  convergence  of  Newton's  method. 


in  vector  form.  The  theorem  of  Kantorovich  life)  requires  the  numbers 
B  ,  •  |||f'(y)l  1 1|  ,  yQ  1  ll*j  "  X0I|»  both  obtainable  rigorously  by  interval 
computation,  and  a  Lipschitz  constant  K  for  f'  on  some  sufficiently 
large  ball  X  U(y,p).  The  programs  described  in  110,11,23)  use  automatic 
differentiation  to  obtain  the  bilinear  operator  f"(x)  defined  by  (4.4). 
Evaluation  by  interval  arithmetic  gives  the  extension  F"(X)  and  the  desired 
Lipschitz  constant 

(6.10)  K  -  |F"(X)  |  . 

Now,  if 


(fe.ll) 


then  the  Kantorovich  theorem  guarantees  the  existence  of  x  such  that 

•  • 
f (x  )  ■  0  in  X,  the  convergence  of  the  sequence  generated  by  (6.9)  to  x  , 

and  provides  the  error  bound  i !  y  —  x  | ]  <_  p. 


c.  An  interval  Newton's  method.  One  may  obtain  an  interval  version  of 
Newton's  method  by  setting 


(6.12) 


X  ^  (m(X  ) 
n  n 


(F ' (X  )) 
n 


F (m(X  )) ) 
n 


n  ■  0,1,2,...,  where  m(X  )  Is  the  midpoint  of  the  interval  X  ,  as 

n  n 

described  by  Nickel  131)  and  Moore  (27).  Here,  X^  can  be  an  arbitrary 

interval,  not  necessarily  a  ball,  and  the  condition  X  ,  C  X  implies  the 

#  n*l  n 

existence  of  a  solution  x  t  X  .  Once  again,  automatic  differentiation  and 

n 

interval  arithmetic  are  used  to  obtain  F' (X  ) ,  which  is  then  inverted  by 

n 

interval  methods.  Inversion  of  interval  matrices  is  a  tedious  process,  how¬ 
ever,  so  (6.12)  is  of  little  practical  importance.  This  computation  is 
available  as  an  option  in  the  program  of  Kuba  and  Rail  1231. 

d.  Moore's  theorem.  This  is  an  interval  theorem,  based  on  the  Krawczyk 
transformation 

(6.13)  K (X)  -  y  -  Yf (y)  ♦  { I  -  YF'(X))(X  -  y) 


of  an  arbitrary  interval  X  (22).  In  (6.13),  y  c  X  and  Y  is  an  arbitrary, 
nonsingular  real  (not  interval)  matrix.  Once  again,  K(X)  is  computed 
by  automatic  differentiation  and  interval  arithmetic,  and  has  been  implemented 
as  an  option  in  the  program  (23)  by  Julia  Gray.  If  K(X)  C  X,  then  Moore's 
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theorem  (29)  guarantees  the  existence  of  a  solution  x  (  X,  which,  as  in 
the  previous  method,  also  yields  error  bounds  for  x  -  m(X)  or  any  other 
approximate  solution.  Kail  136)  has  shown  that  the  Kantorovich  theorem  has 
little  theoretical  advantage  over  Moore's  theorem,  while  the  latter  is  much 
less  costly  to  apply.  Hence,  the  elaborate  implementation  of  the  calcula¬ 
tion  of  (6.10)  in  the  programs  (11,23)  can  be  discarded. 

It  should  be  pointed  out  that  the  purpose  of  the  methods  and  programs 
described  in  this  section  is  not  to  solve  equations,  but  rather  to  guarantee 
existence  of  solutions  and  give  rigorous  error  bounds  for  approximate  solu¬ 
tions.  This  type  of  computation  is  expensive,  but  may  be  justifiable  in 
certain  applications,  such  as  the  design  of  critical  components  of  aircraft. 

7 .  Automation  of  estimation  of  error  of  numerical  integration  and  other 
techniques  of  classical  numerical  analysis.  In  ordinary  applications  of 
numerical  analysis,  error  estimation  is  a  tedious  chore  which  should  be 
mechanized  as  much  as  possible.  As  errors  in  data,  roundoff  error,  and 
truncation  error  contaminate  almost  every  actual  computation,  some  analysis 
of  their  effects  is  required  in  order  to  judge  the  reliability  of  the  results 
obtained  Interval  arithmetic  is  suitable  for  keeping  track  of  data  and 
roundoff  errors  automatically.  If  expressions  for  the  truncation  error  are 
known  in  terms  of  derivatives,  as  in  classical  numerical  analysis,  then 
automatic  differentiation  can  be  used  in  connection  with  interval  arithmetic 
to  obtain  bounds  for  truncation  error.  All  of  this  is  illustrated  aptly  by 
the  work  of  Moore  on  ordinary  differential  equations  (3,25,26,27,28,30), 
and  by  the  following  examples  from  ordinary  numerical  analysis. 

The  processes  of  interpolation  and  numerical  integration  and  differentia¬ 
tion  can  be  regarded  as  coming  from  sene  linear  functional  A  applied  to  the 
function  f  under  consideration.  Thus,  one  writes 


(7.1) 


Af 


n  ,  P 

-  [  a,f(x.)  4  (-)  t(C).  C  «  la.bl  , 


l-l 


where  the  remainder  term  t(£)  is  some  linear  combination  of  derivatives 
of  f  at  C,  for  example. 


(7.2) 


t(E)  -  Cpf (P_1) (C)  . 
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is  known.  The  linear  combination  of  values  of  f. 


is  usually  .ailed  the  rule  of  numerical  interj>olation,  integration,  or 
differentiation,  etc.  Taking  an  interval  extension  of  (7.1)  gives,  for 
X  -  (a.b). 


(7.4) 


white  the  interval  on  the  riqht  can  be  computed  automatically  by  use  of  the 
software  described  in  this  paixtr.  Thus,  this  furnishes  a  complete  error 
analysis  of  the  use  of  the  rule  Rf  given  by  (7.3)  in  order  to  calculate  the 
functional  Af.  This  type  of  analysis  is  carried  out  in  the  programs  of 
Gray  and  Rail  [12,13,14)  for  bounding  the  error  of  various  rules  for  the 
numerical  integration  of  functions  of  a  single  variable.  For  example,  for 
Simpson's  rule,  the  form  of  (7.4)  is 


f  (x) dx 


‘LiA'ki 

6 


(w [ a  ,b) ) 
2880 


Y1V(l*,b]) 


where  w[a,b)  ■  b  -  a  is  the  width  of  the  interval  [a,bl. 

Although  the  speed  of  the  computer  blurs  the  distinction  between  a  priori 
and  a  posteriori  error  estimations,  an  expression  of  the  form  can  furnish 
cither.  For  example,  (7.4)  can  be  computed  for  a  small  value  of  n,  and 
then  its  width  can  be  extrapolated  to  larger  values  of  n,  as  the  width  of 
T (X !  from  the  original  computat ion  will  always  be  an  upper  bound  for 
T(X  ),  C  x.  This  method  for  optimization  of  time  or  accuracy  in 
numeric  il  integration  is  implemented  in  the  program  (13).  A  posteriori  error 
atimatea,  of  course,  are  always  available  directly  from  the  computation  of 
(7.4).  Once  it  has  been  established  that  Af  e  [c.d]  ,  one  may  take  as  an 
estimate  for  Af  one  of  the  numbers 


m(a,b)  * 


a  *  b 


h[a,b]  » 


a  ♦  b  ' 


g(a,b)  -  *'ab  , 


h  minimize  the  absolute  error,  relative  error,  or  maximizes  the  relative 
pt';cision  in  tnc  sense  of  Olver  (32),  respectively.  Interval  results  can  be 
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* 


intersected,  if  more  than  one  is  available,  to  obtain  a  possible  improve¬ 
ment  (12,13) . 

As  an  example  of  the  application  of  this  program,  the  region  to  the  left 
and  below  of  th<  staircase  in  Figure  7.1  shows  where  S-place  accuracy  for  the 
calculation  of  the  elliptic  integral  of  first  kind 

1 

^  -  — 

(7.7)  F (k , 0)  -  /  (1  -  k2sin20>  2 d0 

0 

can  be  guaranteed,  using  Simpson's  rule  with  7  nodes  (three  times  on  [0,v]). 
This  guarantee  includes  roundoff  on  the  UNIVAC  1110  in  single  precision 
(to  about  9  decimal  digits),  and  should  be  valid  for  calculators  carrying 
more  places.  A  result  of  this  kind  might  be  useful  in  the  design  of  software 
for  programmab le  hand  calculators,  however,  the  important  point  is  that,  the 
results  shown  in  Figure  7.1  were  obtained  interactively  at  a  computer  terminal. 
The  user  supplied  the  integrand  of  (7.7),  and  then  various  values  of  k  and 
».  The  detailed  error  analysis  was  carried  out  by  the  computer,  using  the 
program  (13). 

The  error  of  numerical  differentiation  can  be  analyzed  in  the  same  way. 

It  may  seem  strange  to  use  an  analytic  differentiator  for  this  purpose;  how¬ 
ever,  it  may  turn  out  to  be  cheaper  or  more  suitable  to  the  input  data  to 
calculate  derivatives  as  linear  combinations  of  function  values  than  to 
execute  the  differentiation  subroutine,  once  the  rule  for  numerical  analysir. 
has  been  established  to  be  of  sufficient  accuracy  for  the  functions  considered. 
Here  again,  the  elaborate  differentiation  and  interval  software  is  used  for 
error  analysis  in  the  design  of  a  program  for  production  computation,  not  for 
the  computation  itself. 

8.  Conclusions  and  predictions.  The  above  illustrations  give  an  idea  of  the 
power  and  usefulness  of  software  for  automatic  differentiation,  however,  in 
spite  of  its  portability,  it  is  not  widely  accepted  at  the  present  time  by 
numerical  analysts.  One  hears  objections  such  as:  (i)  It  won't  work; 

(ii)  it  won't  work  well;  or  (iii)  a  problem  is  known  to  which  it  does  not 
apply.  Objections  (i)  and  (ii)  ,  coming  from  people  who  use  formula  translator  - 
routinely,  are  demolished  by  the  facts  that  differentiators  are  based  on  the 
same  principles  as  translators  and  have  been  used  with  success  at  the 
Mathematics  Research  Center  and  elsewhere  for  more  than  a  decade.  Objec" 
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P(#,k)  by  Slapaon'a  rula  with  7  nodaa. 


(in)  is  simply  a  negative,  nonconstructive  remark  which  applies  to  any  method 
or  tool.  A  more  positive  approach  would  be  to  think  of  problems  to  which  this 
software  does  apply,  as  the  examples  given  above  indicate. 

There  are, of  course,  alternatives  to  the  use  of  automatic  differentiators. 
One  is  to  derive  and  code  all  needed  derivatives  by  hand.  This  is  a  dull, 

routine  task,  highly  prone  to  error.  Thus,  to  make  as  much  use  of  the  speed 

and  accuracy  of  the  computer  as  possible,  such  mechanical  jobs  should  be 
turned  over  to  it.  The  other  alternative  is  to  avoid  the  use  of  derivatives 
and  Taylor  series  altogether.  This  has  been  done  with  great  success  in  a 
number  of  areas,  and  the  resulting  finite  difference  methods  have  enriched  both 
the  theory  and  practice  of  numerical  analysis.  However,  there  is  a  long 
tradition  of  the  use  of  series  and  derivatives  in  numerical  analysis,  dating 
back  at  least  to  the  1730  book  by  Stirling  (43).  Rather  than  turn  one's  back 

on  this  long  history  of  development,  it  would  seem  to  be  better  to  make  as 

much  use  of  the  power  of  mathematical  analysis  as  possible,  and  employ  deriva¬ 
tives  and  series  where  appropriate. 

Another  point  brought  up  in  this  paper  is  the  close  relationship  in 
applications  between  automatic  differentiation  and  interval  arithmetic,  even 
though  they  are  quite  different  in  concept  and  implementation.  The  reason  for 
this  is  that  in  floating-point  (real)  arithmetic,  the  derivative  can  often  be 
successfully  approximated  by  the  difference  quotient 

(8.1)  f  *  (x)  -  f  (x  4  -  ,  . 

h 

Numerical  analysts  working  with  finite  derivatives,  however,  know  that  h 
cannot  be  taken  too  small  in  (8.1),  as  the  significant  digits  in  f (x  ♦  h) 
and  f (x)  will  cancel  out,  and  the  resulting  roundoff  error  will  be  multiplied 
by  the  huge  number  1/h  to  give  a  result  of  no  significance  whatever.  It  is 
known  that  if  f(x)  can  be  calculated  with  absolute  precision  n,  then 
h  -  is  as  small  as  h  should  be  taken.  Thus,  a  somewhat  inaccurate  value 

of  the  derivative  is  the  price  paid  to  keep  the  computation  meaningful.  How¬ 
ever,  this  type  of  inaccuracy  is  of  no  consequence  in  many  applications.  In 
interval  arithmetic,  however,  one  has 

(8.2)  [0, 1 J  -  10,1]  -  1-1,11  » 

so  cancellation  is  replaced  by  a  widening  of  intervals,  which  is  only 
aggravated  by  multiplying  by  large  numbers.  This  "painful  honesty"  of  interval 
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arithmoti  iadk.es  ,  r-  i «  uitoble  for  use  with  derivatives  than  difference 
quotient  approx ijtiat ions  .uch  as  (8.1). 

From  an  historical  standpoint,  formula  translators  did  not  come  into 
general  use  until  hardware  for  floating-point  arithmetic  became  widely  avail¬ 
able.  Then,  one  could  declare  a  variable  to  be  "real”,  and  forget  the  scaling, 
etc.  which  is  better  done  in  machine  or  assembly  lanquage  for  fixed-point 
computat ions .  Because  of  the  intimate  relationship  between  interval  analysis 
and  derivatives,  one  may  have  to  wait  for  hardware  for  interval  arithmct ic 
befon  differentiation  by  software  is  widely  accepted. 

Finally,  the  present  bleed  of  automatic  differentiators,  as  all  human 
products,  can  be  improved.  At  present,  this  seems  to  be  a  task  which  is  too 
exotic  for  numerical  analysts  and  too  mundane  for  computer  scientists.  How¬ 
ever,  as  with  formula  translators,  one  can  expect  improvements  to  come  with 
use  and  coupled  with  advances  in  hardware  and  computer  science. 
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At  present,  software  for  formula  translation  is  used  routinely  in 
LV'-rical  computation.  On  the  other  hand,  although  software  for  differentia' 
t  r.  f  formulas  is  easy  to  construct  and  widely  available,  many  numerical 
l  ilyits  seem  to  be  unaware  of  its  existence  and  potential  for  numerical 
commutation.  A  simple  procedure  for  formula  translation  and  differentiation 
•  '  be  described,  and  some  significant  applications  will  be  indicated.  In 
-d  nary  computation,  these  include  solution  of  ordinary  and  partial  differ¬ 
ent  il  equations  by  ser  tes  methods  (Taylor  and  Lie  series,  for  example). 
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solution  of  nonlinear  systems  of  equations  by  Newton's  method  and  its  variants, 
and  nonlinear  optimization  (constrained  and  unconstrained).  Together  with 
interval  analysis,  differentiation  can  be  used  to  determine  properties  of  func¬ 
tions  and  thus  automate  the  application  of  certain  theorems,  such  as  ones  for 
existence  of  fixed  jxnnts  of  n-dimensional  operators  or  solutions  of  nonlinear 
systems  of  equations.  Another  large  field  of  application  of  differentiation 
is  to  automatic  error  analysis,  either  using  the  graph  structure  of  the  computa¬ 
tion,  or  interval  analysis.  An  example  is  given  of  a  program  for  numerical 
integration  in  which  automatic  differentiation  and  interval  analysis  are  combined 
t.  provide  a  priori  and  a  ;osteriori  error  bounds  for  the  results. 


